Answering the Most Correlated N Association Rules Efficiently

Authors

  • Jun Sese
  • Shinichi Morishita
Abstract

Many algorithms have been proposed for computing association rules within the support-confidence framework. One drawback of this framework is that it expresses the notion of correlation poorly. We propose an efficient algorithm for mining association rules that uses statistical metrics to determine correlation. Conventional techniques developed for the support-confidence framework cannot be applied directly, because correlation functions do not satisfy the anti-monotonicity property that is crucial to traditional methods. In this paper, we propose heuristics for vertically decomposing a database, for pruning unproductive itemsets, and for traversing a set-enumeration tree of itemsets, all tailored to computing the N most significant association rules, where N is specified by the user. We experimentally compared the combination of these three techniques with the previous statistical approach. Our tests confirmed that computational performance improves by several orders of magnitude.
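To make the anti-monotonicity point concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the chi-squared statistic as the correlation metric (the abstract says only "statistical metrics"), and it ranks rules by brute-force enumeration with none of the vertical decomposition, pruning, or tree-traversal heuristics the paper proposes. The names `chi_squared`, `top_n_rules`, and the toy `transactions` data are hypothetical.

```python
import heapq
from itertools import combinations


def chi_squared(transactions, antecedent, target):
    """Chi-squared statistic of the 2x2 contingency table for antecedent -> target.

    Assumption: chi-squared stands in for the paper's correlation metric.
    """
    n = len(transactions)
    a = set(antecedent)
    x = sum(1 for t in transactions if a <= t)        # rows containing the antecedent
    y = sum(1 for t in transactions if target in t)   # rows containing the target
    xy = sum(1 for t in transactions if a <= t and target in t)
    if x in (0, n) or y in (0, n):
        return 0.0  # a degenerate table carries no correlation signal
    observed = [xy, x - xy, y - xy, n - x - y + xy]
    expected = [x * y / n, x * (n - y) / n, (n - x) * y / n, (n - x) * (n - y) / n]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def top_n_rules(transactions, items, target, n_rules, max_size=2):
    """Brute-force top-N ranking of antecedents by chi-squared (illustration only)."""
    heap = []  # min-heap holding the best n_rules (score, antecedent) pairs so far
    for size in range(1, max_size + 1):
        for antecedent in combinations(sorted(items), size):
            entry = (chi_squared(transactions, antecedent, target), antecedent)
            if len(heap) < n_rules:
                heapq.heappush(heap, entry)
            elif entry > heap[0]:
                heapq.heapreplace(heap, entry)
    return sorted(heap, reverse=True)


# Hypothetical toy data: the target "t" occurs exactly when both "a" and "b" occur.
transactions = [
    {"a", "b", "t"}, {"a", "b", "t"},
    {"a"}, {"a"},
    {"b"}, {"b"},
    set(), set(),
]

single = chi_squared(transactions, ("a",), "t")
pair = chi_squared(transactions, ("a", "b"), "t")
best = top_n_rules(transactions, ["a", "b"], "t", n_rules=1)
```

In this toy data the support of {a, b} is lower than that of {a}, as anti-monotonicity of support guarantees, yet `pair` is larger than `single`. A superset can therefore score higher than its subsets, so a low-scoring itemset cannot simply be discarded along with all of its supersets the way an infrequent itemset can.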

Related resources

Using a Data Mining Tool and FP-Growth Algorithm Application for Extraction of the Rules in two Different Dataset (TECHNICAL NOTE)

In this paper, we aim to improve association rules for use in recommender systems. Recommender systems provide a method for creating personalized offers. One of the most important types of recommender system is collaborative filtering, which mines user information and offers users appropriate items. Among the data mining methods, finding frequent item sets and ...

Extensible markup Language approximate query answering Using data mining, intentional based on Tree-Based Association Rules

With the increasing popularity of XML for data representation, there is a lot of interest in searching XML data. Because XML is structurally heterogeneous and its textual content is diverse, it is daunting for users to formulate exact queries and find accurate answers. Therefore, approximate matching is introduced to deal with the difficulty in answering users’ queries, and this matching could...

Identifying and Evaluating Effective Factors in Green Supplier Selection using Association Rules Analysis

Nowadays companies evaluate suppliers on the basis of a variety of factors and criteria that affect supplier selection. This paper aims to identify the key criteria for selecting green suppliers using an efficient algorithm called iterative process mining (i-PM). Green data were collected first by reviewing the previous studies to identify various environmental cri...

An Efficient Method for Mining Association Rules with Item Constraints

Most existing studies on association rule discovery have focused on finding association rules between all items in a large database that satisfy user-specified minimum confidence and support. In practice, users are often interested in finding association rules involving only some specified items. Meanwhile, based on the search results in former queries, users tend to change the minimal confiden...

Mining negative association rules

The focus of this paper is the discovery of negative association rules. Such rules are complementary to the kinds of association rules most often encountered in the literature and have the form X→¬Y or ¬X→Y. We present a rule discovery algorithm that finds a useful subset of valid negative rules. In generating negative rules, we employ a hierarchical graph-structured taxonomy of do...

Publication date: 2002